The purpose of this notebook is to find a good location for the new restaurant in Phoenix. We approach the analysis in two ways: by considering the success and number of reviews of existing restaurants in Phoenix, and by evaluating the competition this new restaurant chain will face in the city.
import pandas as pd
import folium
from folium import plugins
from folium.plugins import HeatMap
import numpy as np
import matplotlib.pyplot as plt
First, we load the data from cleaned_restaurants.csv and keep only the restaurants in the Phoenix metropolitan area (state AZ, latitude below 34).
cleaned_restaurants=pd.read_csv('cleaned_restaurants.csv')
p_metro=cleaned_restaurants[cleaned_restaurants['state']=='AZ']
p_metro=p_metro[p_metro['latitude']<34].copy()
We define an objective function that combines the success (stars) and the number of reviews of each restaurant in Phoenix. It weights the normalised star rating by 30% and the normalised review count by 70%, because we consider the number of potential customers more important for a good restaurant.
p_metro['objective']=0.3*p_metro['stars']/max(p_metro['stars'])+0.7*p_metro['review_count']/max(p_metro['review_count'])
modeldata=pd.DataFrame()
modeldata['lat']=p_metro['latitude']
modeldata['lon']=p_metro['longitude']
modeldata['rating']=p_metro['stars']
modeldata['reviews']=p_metro['review_count']
modeldata.head()
modeldata['objective']=0.3*modeldata['rating']/max(modeldata['rating'])+0.7*modeldata['reviews']/max(modeldata['reviews'])
modeldata.describe()
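To make the weighting concrete, here is a minimal sketch of the same score on a hypothetical toy frame. The helper name `objective_score` and the three example rows are assumptions for illustration; only the 0.3/0.7 weights and the max-normalisation come from the notebook.

```python
import pandas as pd

def objective_score(stars, reviews, w_stars=0.3, w_reviews=0.7):
    """Weighted sum of max-normalised stars and max-normalised review counts."""
    return w_stars * stars / stars.max() + w_reviews * reviews / reviews.max()

# Hypothetical toy data, not taken from the Yelp files
toy = pd.DataFrame({
    "stars": [5.0, 4.0, 2.5],
    "review_count": [100, 1000, 10],
})
toy["objective"] = objective_score(toy["stars"], toy["review_count"])
print(toy)
```

In this toy example the 5-star restaurant with few reviews scores only 0.37, while the 4-star restaurant with the most reviews scores 0.94, which reflects the 70% weight on review count.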
We now plot a heatmap of only those restaurants with an objective value above 0.55, that is, those with the desired combination of reviews and stars.
from folium import plugins
from folium.plugins import HeatMap
heatmap= folium.Map(location=[33.451, -112.0578],
zoom_start = 11)
# Ensure you're handing it floats
modeldataheat=modeldata.copy()
modeldataheat['lat'] = (modeldataheat['lat']).astype(float)
modeldataheat['lon'] = (modeldataheat['lon']).astype(float)
heat_df = modeldataheat[modeldataheat.objective>0.55]
heat_df = heat_df[['lat', 'lon']]
heat_df = heat_df.dropna(axis=0, subset=['lat','lon'])
# List comprehension to make our list of lists
heat_data = [[row['lat'],row['lon']] for index, row in heat_df.iterrows()]
# Plot it on the map
HeatMap(heat_data).add_to(heatmap)
# Display the map
heatmap
from IPython.display import Image
Image("C:/Users/hecto.DESKTOP-ACA6A3T/Documents/DENMARK/BANALYTICS/2semester/Advance BA/Final Project/FINAL FILES/localization_1.PNG")
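The filtering and conversion steps above can also be written without iterating over rows. The following is a minimal sketch on a hypothetical toy frame: the values are invented for illustration, and only the column names `lat`, `lon`, `objective` and the 0.55 threshold come from the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for modeldata
toy = pd.DataFrame({
    "lat": [33.45, 33.50, None],
    "lon": [-112.07, -111.93, -112.00],
    "objective": [0.60, 0.40, 0.90],
})

# Keep rows above the threshold, drop rows with missing coordinates,
# and convert to a plain list of [lat, lon] pairs
heat_data = (
    toy.loc[toy["objective"] > 0.55, ["lat", "lon"]]
    .dropna()
    .astype(float)
    .values.tolist()
)
print(heat_data)
```

The resulting list of pairs has the same shape as the `heat_data` list built with `iterrows()` above, so it could be passed to `HeatMap` in the same way.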
As the heatmap shows, the best businesses in the Phoenix metropolitan area are concentrated in the city centers of Phoenix, Scottsdale and Tempe. For a first restaurant, the Phoenix city center is the better choice since it has more customers and visibility.
In this part, since we know the target customers in Phoenix, we visualize where they usually go so that the new restaurant can be placed close to those places.
business=pd.read_csv('reviews_rest_phoenix_metropolis.csv')
Target_users=pd.read_csv('future_customers.csv')
restaurants=business.merge(cleaned_restaurants,on='business_id',how='left')
restaurants=restaurants[restaurants['categories'].notnull()==True]
customers=Target_users['user_id'].unique()
List_restaurants=(restaurants.groupby(['business_id']).count())
List_restaurants=List_restaurants.rename(columns={'cool': 'Number of reviews'})
List_restaurants=List_restaurants['Number of reviews']
List_restaurants=pd.DataFrame(List_restaurants)
List_restaurants=List_restaurants.reset_index(level=['business_id'])
List_business_id=List_restaurants['business_id']
We now analyze which restaurants the target users visit; this tells us which areas of Phoenix they are likely to go to.
frequency=[]
for B_id in List_business_id:
    search=restaurants[restaurants['business_id']==B_id]
    users_search=search['user_id'].unique()
    rate=0
    for user in users_search:
        if user in customers:
            rate=rate+1
    frequency.append(rate)
The frequency tells us how many customers in our sample have been to each restaurant in Phoenix.
List_restaurants['frequency']=frequency
List_restaurants=List_restaurants.sort_values(by=['frequency'],ascending= False )
List_restaurants.head(25)
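The counting loop above scans the reviews once per restaurant, which can be slow on a large review table. As a possible alternative, the same count of distinct target users per restaurant can be sketched with `isin` and `groupby`. The toy `reviews` frame and the `target_users` set below are hypothetical stand-ins for `restaurants` and `customers`.

```python
import pandas as pd

# Hypothetical toy reviews: one row per (business, user) review
reviews = pd.DataFrame({
    "business_id": ["b1", "b1", "b1", "b2", "b2", "b3"],
    "user_id":     ["u1", "u1", "u2", "u2", "u3", "u4"],
})
target_users = {"u1", "u2"}  # hypothetical target customers

# Number of distinct target users who reviewed each restaurant;
# restaurants with no target users get a frequency of 0
frequency = (
    reviews[reviews["user_id"].isin(target_users)]
    .groupby("business_id")["user_id"]
    .nunique()
    .reindex(reviews["business_id"].unique(), fill_value=0)
    .rename("frequency")
)
print(frequency)
```

A user who reviewed the same restaurant twice (like `u1` at `b1` here) is counted once, which matches the use of `unique()` in the loop.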
We will only consider the restaurants where our target users have been
List_restaurants2=List_restaurants[List_restaurants['frequency']>0]
List_restaurants2=List_restaurants2.reset_index(drop=True)
List_restaurants2.head()
List_restaurants2.describe()
Let's add the location of these restaurants.
List_business_id2=List_restaurants2['business_id']
Lat=[]
Long=[]
for B_id in List_business_id2:
    search=restaurants[restaurants['business_id']==B_id]
    lat=search['latitude'].reset_index(drop = True)
    long=search['longitude'].reset_index(drop = True)
    Lat.append(lat[0])
    Long.append(long[0])
List_restaurants2['latitude']=Lat
List_restaurants2['longitude']=Long
List_restaurants2.head()
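The coordinate lookup above could also be expressed as a single merge. This is a minimal sketch on hypothetical toy frames: `reviews` stands in for `restaurants` and `visited` for `List_restaurants2`, and it assumes every review row of a business carries the same coordinates.

```python
import pandas as pd

# Hypothetical toy data: coordinates are repeated on every review row
reviews = pd.DataFrame({
    "business_id": ["b1", "b1", "b2"],
    "latitude":    [33.45, 33.45, 33.42],
    "longitude":   [-112.07, -112.07, -111.94],
})
visited = pd.DataFrame({"business_id": ["b2", "b1"], "frequency": [3, 1]})

# Keep the first coordinate pair per business, then attach it to the visited list
coords = reviews.drop_duplicates("business_id")[["business_id", "latitude", "longitude"]]
visited = visited.merge(coords, on="business_id", how="left")
print(visited)
```

A left merge keeps the row order of `visited`, so the frequency ranking is preserved.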
We can now calculate the weighted average longitude and latitude of these restaurants, using the frequency as the weight.
List_restaurants3=List_restaurants2.copy()
List_restaurants3['lat2']=List_restaurants3['frequency']*List_restaurants3['latitude']
mean_lat=List_restaurants3['lat2'].sum()/List_restaurants3['frequency'].sum()
print(mean_lat)
List_restaurants3['long2']=List_restaurants3['frequency']*List_restaurants3['longitude']
mean_long=List_restaurants3['long2'].sum()/List_restaurants3['frequency'].sum()
print(mean_long)
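The frequency-weighted centre computed above is an ordinary weighted mean, so it can be cross-checked with `numpy.average`. The sketch below uses a hypothetical two-restaurant frame; only the column names come from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the second restaurant was visited by three times as many target users
toy = pd.DataFrame({
    "latitude":  [33.40, 33.50],
    "longitude": [-112.10, -111.90],
    "frequency": [1, 3],
})

# Weighted mean: sum(frequency * coordinate) / sum(frequency)
mean_lat = np.average(toy["latitude"], weights=toy["frequency"])
mean_long = np.average(toy["longitude"], weights=toy["frequency"])

# Same result with the explicit formula used in the notebook
manual_lat = (toy["frequency"] * toy["latitude"]).sum() / toy["frequency"].sum()
print(mean_lat, mean_long, manual_lat)
```

The weighted centre lies three quarters of the way towards the more frequently visited restaurant, as expected from the 1:3 weights.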
Let's now look at all of the restaurants in Phoenix. The file Business_Metropolis can be downloaded by running the notebook DescripitveAnalysis Yelp Dataset from https://github.com/hecmesge/ABA2020
Business=pd.read_excel('Business-Metropolis.xlsx')
Business=Business[Business['categories'].notnull()==True]
Restaurants=Business[Business['categories'].str.contains('Restaurants')]
Restaurants=Restaurants[Restaurants['metropolis']=='Phoenix']
Restaurants.head()
List_business_id2=List_restaurants2['business_id']
Target_Restaurants=Restaurants[Restaurants['business_id'].isin(List_business_id2)]
Target_Restaurants.describe()
Target_Restaurants.head()
restaurants_Phoenix_target=Target_Restaurants[Target_Restaurants['distance metropolis']<=90].reset_index(drop = True)
long_target=restaurants_Phoenix_target['longitude']
lat_target=restaurants_Phoenix_target['latitude']
restaurants_Phoenix contains all of the restaurants in Phoenix whose distance metropolis value is at most 90.
restaurants_Phoenix=Restaurants[Restaurants['distance metropolis']<=90].reset_index(drop = True)
long_P=restaurants_Phoenix['longitude']
lat_P=restaurants_Phoenix['latitude']
This map shows all of the restaurants in Phoenix (black), the restaurants our target users have been to (cyan), and the average location of those visited restaurants, weighted by the number of target users who have been to each one (red).
# Phoenix location
lat_Phoenix = 33.4483771
lon_Phoenix = -112.0740373
map_P = folium.Map([lat_Phoenix, lon_Phoenix], zoom_start=10)
for i in range(len(long_P)):
    # Circle marker
    folium.CircleMarker([lat_P[i],long_P[i]], radius=1, color='black').add_to(map_P)
for i in range(len(long_target)):
    # Circle marker
    folium.CircleMarker([lat_target[i],long_target[i]], radius=1, color='cyan').add_to(map_P)
folium.CircleMarker([mean_lat,mean_long], radius=3, color='red').add_to(map_P)
map_P
from IPython.display import Image
Image("C:/Users/hecto.DESKTOP-ACA6A3T/Documents/DENMARK/BANALYTICS/2semester/Advance BA/Final Project/FINAL FILES/heat_1.JPG")
The following map is a heat map of the restaurants where our target users have been.
# Phoenix location
lat_Phoenix = 33.4483771
lon_Phoenix = -112.0740373
map_Phoenix = folium.Map([lat_Phoenix, lon_Phoenix], zoom_start=10)
for index, row in List_restaurants2.iterrows():
    folium.CircleMarker([row['latitude'], row['longitude']],
                        radius=row['frequency'],
                        fill_color="#3db7e4", # light blue fill
                       ).add_to(map_Phoenix)
# convert to a list of [lat, lon] pairs for the heatmap
matrix = List_restaurants2[['latitude', 'longitude']].values.tolist()
# plot heatmap
map_Phoenix.add_child(plugins.HeatMap(matrix, radius=15))
map_Phoenix
Image("C:/Users/hecto.DESKTOP-ACA6A3T/Documents/DENMARK/BANALYTICS/2semester/Advance BA/Final Project/FINAL FILES/heat_2.JPG")
List_restaurants2.head(10)
competitor1=restaurants_Phoenix[restaurants_Phoenix['business_id']=='JzOp695tclcNCNMuBl7oxA']
competitor1.head()
Four_Peaks_Brewing=restaurants_Phoenix[restaurants_Phoenix['name']=='Four Peaks Brewing']
Four_Peaks_Brewing.head()
This first competitor only has one restaurant in Phoenix
competitor2=restaurants_Phoenix[restaurants_Phoenix['business_id']=='85o8XAZDoPmRqHFhdJLAGg']
competitor2.head()
Song_Lynn=restaurants_Phoenix[restaurants_Phoenix['name']=='Song Lynn']
Song_Lynn.head()
competitor3=restaurants_Phoenix[restaurants_Phoenix['business_id']=='gQMAcDm8kv8ev7x2BshMwg']
competitor3.head()
Pho_Thanh=restaurants_Phoenix[restaurants_Phoenix['name']=='Pho Thanh']
Pho_Thanh.head()
Pho Thanh also has only one restaurant in Phoenix.
Let's now consider only the Vietnamese restaurants in Phoenix that have at least 4 stars and that are in the list of restaurants visited by our target users. These restaurants are the main competitors of Banh Mi Boys.
Vietnamese_Phoenix=restaurants_Phoenix[restaurants_Phoenix['categories'].str.contains('Vietnamese')]
Vietnamese_Phoenix2=Vietnamese_Phoenix[Vietnamese_Phoenix['stars']>3.5]
Main_competitors=Vietnamese_Phoenix2[Vietnamese_Phoenix2['business_id'].isin(List_business_id2)]
Main_competitors.head()
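The three filters above (category, rating, visited by target users) can also be combined into one boolean mask. This is a minimal sketch on a hypothetical toy frame; `visited_ids` stands in for `List_business_id2`. Since Yelp star ratings move in half-star steps, `stars > 3.5` and `stars >= 4` select the same rows.

```python
import pandas as pd

# Hypothetical toy data, not taken from the Yelp files
toy = pd.DataFrame({
    "business_id": ["b1", "b2", "b3", "b4"],
    "categories": ["Vietnamese, Restaurants", "Restaurants, Vietnamese",
                   "Pizza, Restaurants", None],
    "stars": [4.5, 3.5, 5.0, 4.0],
})
visited_ids = ["b1", "b2", "b3"]  # hypothetical ids visited by target users

# na=False treats missing categories as "not Vietnamese" instead of raising on NaN
is_vietnamese = toy["categories"].str.contains("Vietnamese", na=False)
main_competitors = toy[
    is_vietnamese & (toy["stars"] >= 4) & toy["business_id"].isin(visited_ids)
]
print(main_competitors)
```

In the notebook the rows with missing categories were already removed when Business was loaded, so `na=False` is only a safeguard here.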
# Phoenix location
lat_Phoenix = 33.4483771
lon_Phoenix = -112.0740373
m_Phoenix = folium.Map([lat_Phoenix, lon_Phoenix], zoom_start=10)
for index, row in Main_competitors.iterrows():
    folium.CircleMarker([row['latitude'], row['longitude']],
                        radius=5,
                        fill_color="#3db7e4", # light blue fill
                       ).add_to(m_Phoenix)
# convert to a list of [lat, lon] pairs for the heatmap
matrix = Main_competitors[['latitude', 'longitude']].values.tolist()
# plot heatmap
m_Phoenix.add_child(plugins.HeatMap(matrix, radius=15))
m_Phoenix
Image("C:/Users/hecto.DESKTOP-ACA6A3T/Documents/DENMARK/BANALYTICS/2semester/Advance BA/Final Project/FINAL FILES/heat_3.JPG")
As we can see, the competitors are less present in the three areas where the demand for restaurants is highest: near Scottsdale, Phoenix and Tempe.